<div>In this case, another measure of diversity is better: <a href="https://www.quora.com/Which-is-the-best-measure-of-uncertainty-variance-or-entropy-or-are-they-both-equivalent/answer/Suresh-Kondamudi" target="_blank"><b>entropy</b></a>.</div><div>To see why, it helps to look at the simplified case where data labels are discrete (as in classification). There, entropy is higher when the data is spread across many categories: for <i>N</i> equally likely categories, the entropy equals log(<i>N</i>), which increases with <i>N</i>.</div><div>This reasoning can be generalized to our setting, where data lives in a continuous, high-dimensional space, using <a href="https://en.m.wikipedia.org/wiki/Limiting_density_of_discrete_points" target="_blank">differential entropy</a>.</div>
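<div>The discrete case can be checked numerically. The snippet below is a minimal sketch, not part of the original method: the helper name <code>shannon_entropy</code> and the toy label lists are assumptions made for illustration, and the natural logarithm is used. It computes the entropy of a list of labels from their empirical frequencies.</div>

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Entropy (in nats) of the empirical distribution of `labels`.

    Hypothetical helper for illustration: H = -sum_i p_i * log(p_i),
    where p_i is the fraction of labels falling in category i.
    """
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Uniform labels: entropy matches log(N) and grows with N.
for n in (2, 4, 8):
    uniform = list(range(n))  # one sample per category
    print(n, shannon_entropy(uniform), math.log(n))

# Skewed labels over 2 categories: entropy falls below log(2).
skewed = ["a"] * 9 + ["b"]
print(shannon_entropy(skewed), math.log(2))
```

<div>With equally likely categories the computed value agrees with log(<i>N</i>) up to floating-point error, while a skewed distribution over the same categories gives a lower value, which is why entropy rewards data that is spread out.</div>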